Statistics of Pairwise Co-occurring Local Spatio-temporal Features for Human Action Recognition
Abstract
The bag-of-words approach with local spatio-temporal features has become a popular video representation for action recognition. Together, these techniques have demonstrated high recognition results for a number of action classes. Recent approaches have typically focused on capturing global statistics of features. However, existing methods ignore relations between features and thus may not be discriminative enough. Therefore, we propose a novel feature representation that captures statistics of pairwise co-occurring local spatio-temporal features. Our representation captures not only the global distribution of features but also the geometric and appearance (both visual and motion) relations among them. By calculating a set of bag-of-words representations, each for a different geometrical arrangement among the features, we preserve an important association between appearance and geometric information. Using two benchmark datasets for human action recognition, we demonstrate that our representation enhances the discriminative power of features and improves action recognition performance.
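The idea of keeping one bag-of-words histogram per geometrical arrangement can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the function name `pairwise_cooccurrence_bow`, the Euclidean neighbourhood `radius`, the choice of three temporal arrangements (earlier, same frame, later), and the global normalization are all hypothetical, not the authors' method.

```python
import numpy as np

def pairwise_cooccurrence_bow(positions, words, vocab_size, radius):
    """Illustrative pairwise co-occurrence histograms (hypothetical design).

    positions: (N, 3) array of (x, y, t) feature locations.
    words:     (N,) integer codeword index of each feature.
    Returns an array of shape (3, vocab_size, vocab_size): one codeword-pair
    histogram per assumed temporal arrangement of the second feature relative
    to the first (0: earlier, 1: same frame, 2: later).
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words, dtype=int)
    hists = np.zeros((3, vocab_size, vocab_size))
    n = len(words)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            delta = positions[j] - positions[i]
            # Assumption: only features within a fixed radius co-occur.
            if np.linalg.norm(delta) > radius:
                continue
            # Map the sign of the temporal offset (-1, 0, +1) to bins 0, 1, 2.
            arrangement = int(np.sign(delta[2])) + 1
            hists[arrangement, words[i], words[j]] += 1
    total = hists.sum()
    # Assumption: normalize over all arrangements so the entries sum to 1.
    return hists / total if total > 0 else hists

# Toy input: two nearby features and one far away.
h = pairwise_cooccurrence_bow(
    positions=[[0, 0, 0], [1, 0, 1], [50, 50, 50]],
    words=[0, 1, 1],
    vocab_size=2,
    radius=5.0,
)
```

In this toy input only the first two features are close enough to form a pair, so the distant feature contributes nothing. A flattened version of the returned array could serve as the video descriptor passed to a classifier, although the abstract does not state how the representation is used downstream.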
Similar Resources
Human Action Recognition in Videos Using Stable Features
Human action recognition remains a challenging problem, and researchers continue to investigate it using a variety of techniques. We propose a robust approach for human action recognition. This is achieved by extracting stable spatio-temporal features in terms of pairwise local binary pattern (P-LBP) and scale invariant feature transform (SIFT). These features are used to train an ML...
Spatio-temporal Co-Occurrence Characterizations for Human Action Classification
The human action classification task is a widely researched topic and is still an open problem. Many state-of-the-art approaches involve the use of bag-of-video-words with spatio-temporal local features to construct characterizations for human actions. In order to improve beyond this standard approach, we investigate the use of co-occurrences between local features. We propose the usage of ...
Recognizing Human Actions in Basketball Video Sequences on the Basis of Global and Local Pairwise Representation
A feature-representation method for recognizing actions in sports videos on the basis of the relationship between human actions and camera motions is proposed. The method involves the following steps: First, keypoint trajectories are extracted as motion features in spatio-temporal sub-regions called “spatio-temporal multiscale bags” (STMBs). Global representations and local representations from...
Adaptive Tuboid Shapes for Action Recognition
Encoding local motion information using spatio-temporal features is a common approach in action recognition methods. These features are based on the information content inside subregions extracted at locations of interest in a video. In this paper, we propose a conceptually different approach to video feature extraction. We adopt an entropy-based saliency framework and develop a method for estim...
Human Action Recognition Using Pyramid Vocabulary Tree
Bag-of-visual-words (BOVW) approaches are widely used in human action recognition. Usually, a large BOVW vocabulary is more discriminative for inter-class action classification, while a small one is more robust to noise and thus more tolerant of intra-class variation. In this paper, we propose a pyramid vocabulary tree to model local spatio-temporal features, which can characterize th...